Thanks. Yeah. Yeah. Um a little bit, but uh I'm m I'm not sure he knows uh everything. So uh basically uh well, first we tried to um explain why in this uh resc uh N_ best list re-scoring uh from the slides to enhance the speech recognition on the meetings data is not working. So um uh that's one thing. And uh so what we did with Alessandro is uh um ch t uh performing some statistical tests to see whatever r if the if the words uh the the appearance of the words during the meeting is uh in independent of the ap appearances of the different slides. So the in the case if it is dependent that would mean that certain words tend to appear during certain slides and that there is a correlation and uh then there will be uh a reason like for believing that uh this will work. The resu the result was no. Yeah, true. Yeah, almost nothing. Mm-hmm. Mm-hmm. Huh. So Matthew we were uh yeah, it's uh we were talking that we did some experiments to see if the word during meeting the words appear independently of the slides in a sense that uh if certain words tend to appear during certain slides or no. And uh the answer is uh no. And uh so we did a statistical test for that. And I will add also that we did that on the uh w the words after filtering them from the removing the the stop words. Yeah, so uh so uh bec s because yeah, with the stop words okay th there would be a reason for uh in independence because there is uh such a huge mess of them. But uh even uh after removing them uh there it it remains uh independent. And actually uh so when I was doing the re-scoring experiments, I was doing that on uh one meeting that I selected as being like the best looking one. Uh it ha it had the m the most uh slides. And now I did the same uh on uh the other meetings which were available where well it's only three meetings because well here there are four of them which are from the test set uh from the AMI recogniser w but only three of them have uh slides. 
So it's total of three meetings and for the others the tendency and I c 'cause you know I was when I was doing re-scoring, I was uh taking into account the for one given slide also the n neighbouring slides. And as I was in increasing the number of slides which were affecting the the utterance I was re-scoring um the performance was improving sli s uh slightly. Very very little. But uh Yeah. Uh the the first thing we are talking about ex the experiment to see the uh if it's dependent or independent. How we did it? Uh so we used the Pearson uh chi square uh test no. Ah okay. It was to to see if uh certain words have a tendency to appear during uh uh they if they are more likely to appear during certain slides. W Hmm. Mm-hmm. Mm-hmm. Yeah. Yeah. Yeah. And the answer to that was no. And s so when extending the those experiments to the other meetings uh when while I was observing an improvement on the first meeting I was using, uh so I was r observing an improvement as I was increasing um like the um the context. Yeah. Uh On the on the recognition, yeah. So there was uh the improvement was also increasing as I was increasing the context, while on the other meetings it's um it's it's it's really fluctuating and uh i it it uh it looks more like um uh ps uh well, what's the word for that? Statistical um um fluctuation. So uh Mm-hmm. Huh. Mm-hmm. Exactly. So that's why um uh if we look on the relationship that uh exists between the speech and the slides, uh the task of improving the recognition, uh the overall um sp uh speech recognition using the slides uh seems to be uh uh not very uh um good. I mean um we cannot expect much from from it. Mm well for now the uh most effort was uh on uh um showing why i it's i it's it will not work. Especially those experiments and these uh statistical tests. 
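The Pearson chi-square test of independence mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the function name and the layout of the table (one row per word, one column per slide, each cell holding how often that word was spoken while that slide was shown) are hypothetical, since the discussion does not say how the counts were actually organised.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    Assumed layout (hypothetical): table[i][j] is the number of times
    word i was spoken while slide j was displayed.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Under independence, P(word, slide) = P(word) * P(slide),
            # so the expected count is row_total * col_total / total.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Toy counts, not data from the meetings: two words over two slides.
independent = [[10, 20], [20, 40]]   # same word proportions on each slide
dependent = [[10, 0], [0, 10]]       # each word tied to exactly one slide
print(pearson_chi_square(independent))
print(pearson_chi_square(dependent))
```

A statistic close to zero is what independence looks like. To turn the statistic into a p-value it would be compared against the chi-square distribution with the returned degrees of freedom.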
But uh we w were thinking for example of a task like um trying to uh see if uh where a speaker is talking I mean if the speech is actually correlated with the slides uh where the slides uh which happen during the not correlated, but if the Uh well if the um If the speaker is actually talking about what is in the slides or not. Indicator. Yeah, or for example uh what happened for example in one of the meetings is that p people were um summarizing the the results of the previous meeting, you know. So in the beginning so the meeting started with the summary of what happened uh previous time and this was not related to what was on the slides. Link uh But this is one yeah. Huh. Mm-hmm. Yeah, it was recycling uh kind of. Task. Huh. Huh. Even summarization. Or Mm-hmm. Mm yeah. Uh no. No. But for the other thing, you know uh uh the alternative measures. Or maybe we should finish with this first. Uh I mean uh Yeah, this uh this would be a f uh another subject I would like to discuss. S but uh w but uh uh basically ju just, you know, if we finish uh discussing this first. Mm-hmm. This is not clearly defined yet. But uh um something um using this relationship uh between the slides uh and other even other tek textual support and the uh and the speech. Huh. Mm-hmm. Mm-hmm. Mm-hmm. Yeah, sure. Well, this is um the question I'm working on. Mm-hmm. Mm-hmm. Mm yeah. Yeah, that that could be interesting to try to merge those two uh things. Uh yeah. Yeah. Mm-hmm. Mm-hmm. Yeah. Yeah, and I um uh Are they are interested in that or no? Mm-hmm. Mm-hmm. Mm-hmm, yep. Yeah, that's what what is missing. Mm-hmm. Mm-hmm. Mm-hmm. Mm-hmm. Uh-huh. Mm-hmm. No. AMI meeting. Mm-hmm. Mm yeah. It's a first step. Hmm. Hmm. And we could uh do automatical uh speaker rating if there are good speakers or bad speakers depending if they just read uh Mm-hmm. Yeah. Mm-hmm. Mm-hmm. Yeah. Mm-hmm. Mm-hmm. Mm-hmm, mm-hmm. Yeah. Yeah. Mm-hmm. Maybe possible someday. Yeah. Mm-hmm. 
Well uh did trying to really determine the um the um uh subject of something for my proposal I mean. Yeah, aro around February or March. So really um Yeah. So to find something for that. Uh also with the th thinking maybe turning this uh work into publication, the one about the for sure. And uh also uh uh well that uh depends a little on on on you. But try to finish with the um measures and the call routing. 'Cause I was talking with that uh about that to Herve uh since I was uh I will recycle this poster for I_M_ two, you know, about uh about this. And uh that could be cool to to finish it. Maybe fuse it with uh Ian's work and uh do a journal paper he was suggesting. So so you're so what you are saying is that you you noticed that when you um tried other features and uh Because and initially it was a s Nuance, trained on Nuance, uh no? Or Mm-hmm. Run run run what exactly? For the the the call routing itself or the recognition uh the phoneme recognition? Yeah. I have all uh Jean-Yves's uh scripts and uh programs for performing the call uh routing I think uh starting from the phoneme sequences. Yeah, I I know that I don't know how to use the cluster. Yeah, if it's just running a script I don't know if it's But this is basically detecting speech and no non-speech. Yeah. It's this uh problem I hear every time on the speech meeting how do we detect speech and not speech. Ah and uh Or oh okay. It's not just speech and silence. It's speech and and okay. Mm-hmm. Mm-hmm. But who really can do uh when somebody will really be able to do that, I mean detect the speech from non-speech, it will be a little revolution, no, in the Mm-hmm. Yeah, well this is basically in my three uh main jobs uh proposal uh um this. Ah yeah, yeah. Yeah but yeah. Ah yeah sure. But uh uh uh shouldn't make it clearer before on the on the on what uh I I will write. 
And even uh a first uh uh submission using this work um will make things clearer for me uh writing them down and f for it will help I think for that also. Mm. Mm. Mm-hmm. Mm-hmm. Mm-hmm. Yeah. Well I think uh first uh this submitted and this will help for the proposal. And since uh we cannot start immediately with the measures, I think that's the best thing to do. Mm well no. Well for me it's I think we uh it's fine. Yeah and I think uh that will be exactly one hour and uh uh fill the tape. Uh the tape is sixty minutes. Ah so I was right. Ah yeah. Uh-huh. Mm-hmm. Mm-hmm. Yeah. Yeah.
Okay, we start waiting for people coming late and uh I can say this because in this moment you cannot answer because you don't have the microphone. So I can say whatever and you cannot reply. So yeah. Ye In my opinion it is no because of the nature of the language. I mean intuitively of course you tend to use more the words that are on the slide. But the mass of the words actually you use are words that are common. Just think that fifty percent of the words on average whatever corpus you take are stop words, articles, et cetera. So in terms of recognition fifty percent goes away. Of the rest uh remaining fifty percent I mean are all words that appear one, two, three times. So in any case even if actually and they tend to be uh related or a statistically related to a single slide, in any case in terms of uh recognition do not help at all, or help a very little. So uh that's the kind of thing uh I mean that's the kind of measure, the ind statistical independence is not on single words. Is overall, which basically means in terms of recognition that doesn't help. Hmm. You know, even if actually I remember I just made some measure uh saying okay l let's look at how many words I don't have in the dictionary and are in the slides. It was two percent. Exactly. Exactly. Exactly. So some in some sense it intuitively it sounds very good. But in practical terms, in terms of recognition uh here we we we have to be clear. If you want to improve the recognition, that im improves very slightly. Uh it is different if you want to have other tasks where the only words that all are words. In that case, even the slight improvements uh he gets at that point, they can make the difference. But does the recognition And some way the statistical independence I think it can be a good explanation to show why it is happen even if it is counterintuitive. But that's that's what happens basically. Yeah. Mm-hmm. 
Be be b be careful, it's not exactly this because for sure there are words that tend to be. So the point is you want to verify whether you can improve the recognition rate by using as an information the words that uh are contain in the slides. So basically the idea is that uh in the moment you are in the slide somewhere your language change accordingly to the slide. So the fact of having that information somewhere can help you to improve the recognition. But actually it's not what happens. And this does not happen for many reason. First, for example, fifty percent of the words in any kind of text are stop words. So are everywhere. And the remaining a appear so little, that in the case cannot really improve that much. Most of the words we use actually whatever we talk about are common uh words. So how to verify this? I mean this is something that has been measured et cetera but still is a bit qualitative. To have a quantitative measure, mm-hmm, we simply did uh a measure of statistical independence between the words in general, so not some words, the words in general. Otherwise you can see that if a words appear once, basically it is one hundred percent related to one slide, huh? But you have to i if you want to consider in terms of recognition performance, you have to make it overall. I mean what is important is not the word that appear once, it's the word that appear If uh i exactly. If if there is one word that appear once all over the meeting, it appears in correspondence of one slide, then of course it seems to be very related. But very few words uh they represent a very little part of the word mass, mm. So we simply used a very old test, statistical test, that basically measure the hypothesis that some way the probability of having one word in correspondence of a certain slide is simply the product of the probability of the word for the probability of the. That's it. So that's what we did. But is in general the language and not certain words. 
Because certain words for sure have a strong dependency on the slides. But there are l few. Yeah. Fluctuations? Mm-hmm. It is not in terms I mean for me for example it was not in the sense that uh after working a little bit on language you realise that this kind of thing do not help simply because most of the words uh have nothing to do specifically with the subject you use. Uh most of the words we use uh that's strange, but are simply necessary to build a sentence. There are very few uh content uh words. But it is true that intuitively uh as we are driven in our attention uh I mean in in our understanding we're pretty much driven by attention, that we tend to spot only those words. Exactly, and in some sense intuitively it seems that it can happen, I mean this is not the first attempt to do things like this and it never works actually i in terms of recognition. For other task it can be uh it can be uh helpful. Yeah. Yep. Be careful. Correlate is a dangerous word in statistic. It seems to be the opposite as before. The. Or roughly. Basically uh one thing it can't be done. And and here uh on the contrary for example the few words that are mm how may I say it recognized more can really make the difference. It isn't the case if you want to see if uh I mean what if seen in the meeting and that okay. There are presentation. Then yeah, still the slide is there. But actually the people talk about other things. Or not really other things but the slide is no longer uh a support for the discussion, huh? There are moment that actually this is the support because the people describe, et cetera. And there are moment where it is not. Is a just a background uh thing. And we rough estimated very quickly. But it was one third of the time. At least in in the meeting we have seen. The slide were just there I mean but they were no longer use as a support. It was a discussion rather between people. 
So in that case the presence or the absence and especially the frequence with which you observe the words that are on the slides can be an excellent uh uh clue, huh. Indication, a clear indication, whether actually does it support of the discussion or not. Uh It can be this. It can be interpreted as a kind of focus of interest. It can be simply interpreted in the sense of saying uh okay I mean you have this channel open there. Do I have to p taken into account or not? Or also in terms of action, yeah uh when you see that s the discussion is completely m disconnected with respect to this some way uh it means that it's happen in something different than before. So I mean it's a kind of feature that in my opinion can be easy to detect it and um that can, yeah, uh f can be interesting to do. Relatively easy to do. And uh in that case, for example, the few words mm that you that you get more can make the difference. Oh so is that mm double way to to show how some way dependent on the task. What do you mean exactly? Mm-hmm. Yeah. Some way well there are f yeah yeah yeah yeah yeah. Yeah. Mm-hmm hmm. Yeah, I know, this we are s s talking about today short term if you want. Uh things just to the use the things. I mean all the work they that has done basically, which is a huge work. And the data we have then uh enough for the for a thesis of course it must be much better. But of course the there's only two two different kinds of words. Some of them you can do it on a single meeting. So this kind of feature extraction because basically that's what it is, saying yes no this channel is good. Now is background. Now is foreground, et cetera. You can put it in that in that way. This you do it on a single meeting. And then there are corpus um based um how may I say it works so yeah, that is one possibility for example, finding the connection between different meetings. 
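The foreground and background idea above, using how often the words on the slide show up in the speech, could be sketched along these lines. This is only a minimal sketch under assumptions: the function name, the whitespace tokenisation, and the small stop-word set are hypothetical, and any threshold for deciding that the slide is foreground would have to be tuned on real meeting data.

```python
def slide_overlap(utterance, slide_text, stop_words):
    """Fraction of content words in an utterance that also appear on the slide.

    Stop words are removed first, so only content words are counted.
    Returns 0.0 when the utterance has no content words at all.
    """
    slide_words = {w for w in slide_text.lower().split() if w not in stop_words}
    content = [w for w in utterance.lower().split() if w not in stop_words]
    if not content:
        return 0.0
    return sum(w in slide_words for w in content) / len(content)


# Toy example, not taken from the meetings.
STOP_WORDS = {"we", "the"}  # hypothetical, far too small for real use
score = slide_overlap(
    "we present the remote control budget",
    "Remote control budget",
    STOP_WORDS,
)
print(score)  # 0.75: three of the four content words are on the slide
```

A high score over a stretch of speech would suggest the slide is being used as a support, and a score near zero would suggest it is only in the background.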
Then I don't know, I mean at that point it becomes pretty much crucial the kind of uh data you have in the sense that for the way we have collected the data some way I don't know if it can be at the same time too easy in the sense that some way you have little groups of meetings extremely correlated. So I mean it it makes it easier. I mean I don't know how much significant can be at that level. It can be I in a sense, if you want uh this kind of words become interesting when the corpus is really big uh uh when when you when you consider each meeting as a single item. So it's interesting when you have tens, hundreds, thousands of meetings, which is not the case. But I mean you can do other work if you consider in terms of speaker turns or in terms of slides, and that is immediately multiply. Yeah. Yep, yep. That is another So it can be one more feature that helps in but it is very short-term. I mean just uh well it is ju just few few few times we talk together and it was just, mean, the very quick thing you can can have uh like this. Mm-hmm. But I think in a sense in my opinion well there are two things. When this uh statistical test it can be we have made an experiment we get a result somewhat counter-intuitive we give an explanation. And uh that's it. I mean that's that's something that some way closes here. So when it gives you the kind of answer you get is that okay to go in that in that direction maybe is not the best thing. Maybe to improve the recognition that way is not uh something you can expect. So it gives the possibility to decide to go uh in a different direction that can be more on this uh let's say. This is Hmm. Mm-hmm hmm. Well uh I think the advantage eventually w I mean the advantage with this kind of things is that are much easier. Gaze tracking is still Just with the environment is a difficult thing. I mean it requires cameras with pretty good quality, et cetera. Uh but I think his question is more uh general. 
A and it says so I mean based on this work, do you figure out uh a direction if they guess correctly. Uh a domain, a direction for your thesis for your exactly. Exactly. Exactly. Exactly. Exactly. One domain, yeah. Not a collection of little uh things completely unrelated. But uh That's very Mm-hmm. Yeah. M but still doesn't solve the problem of saying what I yeah yeah, sure sure. What I what you are going to do for your thesis. I mean as Herve said, I mean basically if you have to explain in four lines what will be the topic of my thesis, the subject of my thesis. And it is something that is not just saying I'm going to do uh this this this and that. I mean because this we actually don't know. But yeah, in a very broad sense I mean w what you're going to investigate, what kind of uh If I interpret correctly. Then uh one basic thing I suggested even thinking about I mean is exactly this stru all these scenarios that can be a meeting, that can be presentation where you have a s a relationship between speech, between things that are said and uh some textual documents some way. Why not try to to to to study this relationship, to improve this relationship, to use this relationship for um mm whatever it is, indexing or annotation or whatever, meta-data extraction. And if you want for example this little work uh about seeing when the slide channel is background and foreground is not beginning of this kind of thing. So a relationship between two channel that provide some kind of information. Mm-hmm. Sure. Sure. Sure, sure, sure, sure. Mm-hmm. No no, be careful. For the recognition in general there is no correlation in the sense that even if you're talking about what is on the slide, most of the words you use most of the words you use are not really correlated to that. Are simply words. And consider also that we all speak English here. And apart of uh native English speakers we have a very limited vocabulary in general. So we tend to use always the same words. 
But on average, you know, linguistic research shows that on average the people use five hundred words. That's what we have a disposition. And we say everything uh yeah. We say everything with that. And probably we, not a native English speaker, we use even less. So that's gives an idea of why basically in terms of recognition it doesn't really help because you always use anyway the same words. And a few occurrences of words that are there in terms of recognition make one, two percent. So maybe it's not enough to justify uh an effort to improve the recognition. Then you can change of task and use a task, the task we were mentioning about the back channel or foreground channel. In that case the only words that really are important are those words that appear on both slides. At that point, I mean, a little improvement of that, it becomes a big improvement. I mean maybe in terms of recognition it's zero point five percent. But in terms of that task is maybe twenty percent. Yeah. Yeah. That's another interesting point. I mean we are using data that are fake. Any uh in any case they are simulation, no real data. And especially the slides. I think I've seen those slides. I mean they are pretty much artificial. Exactly. But but not pre-prepare. But Yeah, but Yeah, I yeah, I mean uh everything is pretty much artificial. That that's what I mean. So you mention for example this uh fact that sometimes uh speakers tend to read. This is especially true when you have bullet list, huh. Sometimes people really go through. It didn't happen in the data simply because uh uh it was not the same people prepa uh exactly. Ah? So for example it will be possible in any case because the Herve's infrastructure is ready and now we are going to collect the presentation. There maybe we can see The audio is is is very bad. I I thing is really You know, that was collected m a little bit like this without uh too much care. It was it was uh a personal effort of Mael that decided to do it. 
It was really good to do it. It was very good. But basically it was made without any specific um, you know, without thinking we want to recognize, just to have them. So the especially the audio channel is absolutely awful. Mm-hmm. Yeah. Uh that's uh that's that is one of the reason for example why which uh mark we point pretty much on slides as a mean of indexing. Because at that point there is no interaction between humans and capture. Because whenever you have microphones that you have to move, it's it's it becomes complete uh really Yeah. We are building this slowly. I mean it's it's unfortunate. I mean is is is taking time for the material and so on. But yeah. There will be this. Uh it will be one camera pointing very general but just for display purposes essentially. One uh microphone. And we are thinking to use uh lapel microphone, you know, this this thing. And so that potentially helpful hopefully helpful to to do recognition. And to slides through the projector. Wo Let's say that uh in any case uh w well there are two reason. First of all we are trying to do something that can be easily the the the idea is really it's something that you take you bring somewhere else and it works. So it must be as easy as possible. But in any case I mean the device we are doing has input channels. So some way you can add as many input as we want. So we are dealing now with three. But there are other input uh lines open. Mm-hmm. Mm-hmm. Although at that point we can use this. I mean I was mention the the lapel microphone mm just like this. But if you tell me this, that's very interesting because at this point we can use this kind of things. But I don't why why not uh so there will be other data, their test data. Uh and especially for this kind of problem I think it can be a much more useful kind of data because there is real presentations. I mean is it's reality. You know, it's it's it's uh y n it it it it's exactly. 
It's more realistic and um yeah, probably it can help go in this direction of saying well let's see what happen between speech and slides. Uh Yeah, yeah, yeah. Yeah. Mm-hmm. Yeah. Yeah, yeah, yeah, yeah. Yeah. Mm-hmm. Yep. Okay, that is cool. Yeah, yeah. That's cool. Uh yeah, yeah. Yeah, it's a it's a European project anyway. So I guess uh it must be available somewhere. Yeah. Mm-hmm. Yeah. Yeah. Yeah. Mm-hmm. Mm-hmm. Mm-hmm. M Yeah, mm-hmm. But only we're measuring one third. It's not At least in no, I'm uh talking now about um the meeting. How presentation, of course. On presentation it it should be one hundred percent, yeah. Yeah yeah yeah, of course. No, that is just something that in my opinion uh with the work that's been already done is something we can get pretty quickly. It's an interesting task. It's more specific on uh meetings. Certainly not for uh for presentation. And it represent a nice task that can be measured, that is clear and Mm yeah. Yeah. Uh i in terms of actual recognition can of course uh help. So it's it's uh but of course not going to be it's not going to be a kind of thesis about this. No, that doesn't make any sense. I mean That uh, in a sense, if you want I mean that that's at least the way I see the that list I have made for my own thesis. I mean uh you define that domain I mean which is uh large, I mean which can be very general. So interaction between uh speech and slides, cool. And then you see how all these things, this two little works, or the statistical independence. And detecting when it is uh are two little works that some way fit in that very general uh framework. Shown from this point of view it n that things mm there are other things, for example one that I really would like to do which is kind of fun. So when you have a bullet list can I click on one of the bullet and get the piece of speech where the guy was talking about that. Because sometimes people just read. 
Sometimes people uh im improve is more uh so for example that what that's another thing that fits in that kind of uh mm framework. And that's interesting in terms of uh I mean first of all again is measurable in terms of browsing, in terms of retrieval, in terms of indexing, annotation, meta-data extraction. That can be applied in many That would be you know, that uh that's uh that's something uh you're maybe joking now. That's something that can uh be done for example. And uh some way there are parameters that can be measured that tell you that following certain uh criteria uh there is one very easy for example. I had the the the sp there are speaker that never look at the audience. Those are bad speakers in general. It's very wrong. You have to look at the audience. This is one. There is for example, and this is something that eventually can be done. If you have a certain number of words on your slides you need a certain amount of time to read them. The speakers that take a time which is too close to that, they are not good speaker because some way uh um they're talking too fast. They are not uh Yeah, they are not giving you the time. And there is a certain number of things that can be uh some way detected and measured that tell you whether the speaker is actually respecting some form of uh uh. It's a bit uh of having difficult and uh I would not um go that much in that direction. I mean it's uh but you know you know uh Figure, you know Tables, number. Ah, visual things. Well, that's for example with uh we submitted a project with J Jean-Marc um to use that thing as meta-data. And basically one thing for example we're trying to do and and again, this is something that can fit, huh. It will be not uh maybe. But but uh if you get the project, et cetera in any case it's something that Jean-Marc and me we want to do. Uh based on the kind of graphic object you have, so tables, questions, uh figures, plots, et cetera the side one for example, it is a result. 
Try to get so that's again in terms of structuring, browsing, structure and retrieval, okay when it is a result. When it is an introduction. When it is uh Yeah. Yes, yes, yes, yes, yes, yes. Uh Um It's not really true. It's not really it's not really true. Yeah, yeah, it's not really true. Yeah, that's o it's one of the most funny thing, I mean you can see when you work in this thing. Each one of us has an idea and then it's not true at all. I mean s some people actually are stick very carefully. If the p if they put in the outline uh let's say title one, title two, then you'll you find title one, title two. But many other people, and I am one of those people for example, are much more general. So my outline is something like introduction and experiments, the results, conclusions. You don't find the same title in the slides. It's it's really a different style of uh there are people that don't put, for example, outline. We were uh for example in the um corpus of M_L_M_E_, so the one is on the demo. Fifty percent of the presentation have no outline. In the corpus of the TAM two thousand four, TAM presentation, twenty five percent of the presentation have no outline at all. That's that's uh what yeah, uh w it is. And basically that's that's uh and again, the outline is typical maybe in scientific presentation. Well in that case that majority has. Other kind of presentation probably no. I mean it's it's an habit we have. But um you but again uh I mean unfortunately th n there is nothing true. Nothing true, is is everything is chance pretty much. So uh that was on on possible work. Mm-hmm. Not maybe. For sure. Yeah. Mm-hmm. I see. I know. Yeah yeah. I know, I know, I know. I see. And so essentially say well as there is this silence time which tend to be even more than than how it is, so better to have a kind of silence detection. So something that really takes this and they recognize only that uh that part. This will improve the phoneme uh recognition rate. 
That's what it of of course, of course, of course. And at that point, even the uh call router that uh Jean Jean-Yves has uh made can improve uh its performance. And it can be used after uh tool for the measure work. Uh That we got. Mm-hmm. Mm-hmm. Mm. Mm-hmm, mm-hmm. Mm-hmm. Sure, sure. Then oh yeah yeah yeah. I know. I know the problem. I see. I see. I see. Yeah, yeah, yeah. Yeah, yeah. I see. I see the problem. I'm not not sure it's. Yeah. I'm losi No. No. Mm-hmm. Mm. No no. Basically because it was not at uh at a level that could be anyway. So the idea was good. I think, yeah, with a better uh phoneme recognition rate definitely it could uh improve also the the call routing performance. Nope. But we can we can find. Yeah. Yeah, yeah. Yeah, yeah. But we can some way find a guy. I mean this should should be have all details, so it should be relatively easy to at least this is the last news I had and then I can try to get him back. Uh Yeah. No, going very pra pragmatic uh there is a submission uh deadline for example thirty one December for conference on multi-media and expo where I think these two little works on So it can be It's as a proposal. Because as far as I understand this work on the phoneme anyway still takes some development time in the sense that uh no? Okay. I see. Mm-hmm. And Yeah. Anyway the good news is that there are results. You are you have just to to decide uh, yeah, what what what uh what you want to work first and and and finalise. I think you can really start finalising mm-hmm. Mm-hmm. Mm-hmm. Yeah it's okay. So the tape is ta tape is at the end.
Yeah, sorry about that, Matthew is still on the way. Okay, hang on, reply coming. Yeah, it's okay. U oh, I hate these things. Voila. Okay. So Um we can maybe get a head start. Um have you been talking to Matthew about what you've been doing? Or Yeah. Yeah, yeah. Yeah. Yeah, which was the intuitive feeling. Yeah. Yeah. And is and the r result was no. But uh bu but the question is if is that result no because it's no or is it also just because there's not really enough data to be sure about anything? Sure. Yeah, okay. Yeah. Yeah. Yeah. It's not gonna yeah. Yeah. Yeah, yeah. This is so sort of a feeling that we had but hadn't shown. I think I mean i in uh from uh from an application point of view it might be interesting to make sure that your vocabulary contains all the words on the slides. But from a research perspective it's not interesting. There'd be very few. Well, this is h this is how the dictionary's calculated. It calculated such that if you don't have the word it doesn't affect to a great degree the the the word error rate. So Yeah. Sure. Yep. M like yeah. Yeah. Yeah. But th yeah. No. Well, this is yeah. That's yeah. Yeah. Oh sorry about that. Yeah. Of course. Yeah, yeah. Yep. 'Cause it's not gonna make any difference. Yep. Yep. Hmm. Yeah. Hmm. I mean the it's probably very dependent on the speaker as well. Some people have a habit of just reading what they've got on their slides and some people are com are are well, or or or quite purposefully talk about different things so that they've sort of got multi-modality of you know. But so I mean I guess that's not a surprising result then. The um Yeah. Yeah. S Yeah. Yeah. Yeah, sure. We s go straight to the semantics of the yep. Hmm. Yes. Maybe. Okay. Oh okay. Well I mean uh it just comes down to the to the mere fact that word error rate is just take the words. How many of them did you get right. Take plus or minus two or three or five or ten words doesn't really make any difference. So yeah. 
Even if those words at the end of the day would be quite important in any sort of search of the transcript. But at the end of the day why wouldn't you just use the slides to search the transcript? I if you have th slides, then yeah. No. Y you'd ju I mean the uh I mean it so I mean have you been doing other things then? Have you have you been thinking about what you would like to to do in in in place of this? Or or or leading on from this, given what you've learnt? 'Cause ours is strict. Yeah. Yeah. Associated? Yeah. Sure. The focus. Yeah, okay. Yep. Okay. M meeting action sort of focus. Yep. Hmm. Yeah. Yep. Yeah. Yeah, sure. Yeah. W Well Yeah. Ye yeah, okay. So I mean uh you could begin with s saying the a um the simple task of just determining whether or not the speech is related to the slide content, and which is sort of building upon what, you know, Dong and others have worked on in terms of, you know, is it discussion, monologue, from the original M_ four data collection. Uh a y are are you also considering that you could actually look at the relationship between meetings for instance? Um well okay, well I mean it if if if these similar sort of phrases or words were discussed in this meeting and were also discussed in the previous meeting, then you sort of have a a link between meetings for instance. I don't know I don't know how this you have to ches the check the statistics. But I I mean 'cause obviously you've gotta think towards what, you know, this this won't occupy ab ab you know, won't occupy you for three years or whatever just on this I guess. But have to look at all the future and so just a s Yeah, yeah. Okay. Yeah. Yeah, okay. Yeah. and yep. Yep. Yeah. Well y yeah. Sure. Yeah. Yeah. I mean uh it's it's and because we don't have all the data collected or annotated yet, it's it's very difficult to know, isn't it. Yeah. Yep, yep, yep. Hmm. Yeah, sure. Or b or as you say, back-channel versus yeah. 'Cause I th think that's quite interesting. 
I mean 'cause it's an awful lot of speech activity, which is yeah. Sure. Okay. Okay. So I mean as far as that you've got everything to do that, I presume. Um you don't need any extra stuff from us in the immediate future? Or I mean that's good. That's good. But I mean it it Uh yeah I wouldn't get too distracted I guess on d d if you've got two tracks, I'd like to Yeah. Um Yeah. Hmm. But I And uh the k and the question is also are there other sort of more sensible ways of doing it. For instance um gaze tracking. I mean if if people are looking at the slides Yeah, sure. And it has specific to the environment and all that sort of thing. Yeah. Yeah. Yeah. Mm. Uh but I mean for instance it should be linked to the other stuff that you've been doing on um call routing for instance. And and um, you know, uh error merit err error measures, yes. I mean I think I mean I I think that it's possible to draw some sort of relationship between the two, as we were saying. I mean maybe if you you don't measure things in terms of word error rate, maybe this sort of information does actually mean something. Given that fifty percent of the words are not actually interesting. Um so I mean W Well w uh w No, it doesn't solve the problem, but it's f what is it? Yeah. Hmm. Sure. Mm. Yeah. I think I think it's maybe an interesting test scenario and we know that you're sort of basing something around speech and and text. I think there needs to be one sort of extra higher goal above that so that you can motivate future research. Hmm Yeah. Yeah, this was Yeah. For English it's especially low. Yeah. Yeah. Speci Yeah. Yeah. Well another thing you have to remember is these models have been tuned to include AMI data. So I mean perhaps if you had slides and and presentations to do with something that was off topic, you didn't have you hadn't seen any prior data, then you might see a bigger contribution as well. 
But I mean essentially you've actually sort of included the information from the slides in the language model. Yeah. Okay. Okay, they're pre pre prepared. Okay. Yeah, yeah. It's not there. Okay. Y the audio quality though. Yeah. Yeah. Was good to yeah. Yeah. Ah. Yeah, they had like uh y uh y they, you know, they had this set-up with the uh the microphones. So someone every it there was one microphone shared between every two people. But they had deals with people forgetting to turn the microphone on, forgetting to turn it off, uh f it was it was meant I think they they chose that lecture theatre possibly with the idea that it would be good for collecting data. I mean I uh I'm sure they could have probably had University of Edinburgh host it or whatever. But I th I think the, you know, the H_C_I_ part of it didn't quite work out to how they planned. So ap Yeah. Um it is I mean it's same as us putting on a the headphones and whatever. I mean even that minor thing. I mean any any audio captured during that period is i is rubbish. But there's no sort of system that's really built to d to be able to say when you're you're capturing that. Okay. The Uh what will that incorporate? Will it incorporate like uh an S_M_A_ and uh Yep. Okay. Yeah. Okay. Okay. So no no uh microphone arrays or anything like that. I guess that's quite a that's a quite a bi big deal for setting up the recordings, I guess. Yeah. Yep. As you want, yep. Okay. Yeah, but let's pretend that it's not going to happen in any foreseeable future. Wh uh the I mean I'm just thinking from practical perspectives, you'd you'd have to the the AMI recognition system that we have now is either for these microphones or for the microphone array. There's no lapel system trained for instance. So I mean that in itself, it would I mean you'd have some fun training some more speech recognition models. But Yeah. Yeah. Yeah. Yeah, that's Sure. Yeah. It's what you should be using for this type of task. I think Yeah. 
And every time we d we generate a new corpus and try to carry out new research we discover sort of areas that are too controlled, you know. I mean the AMI ones were a lot better in term of that than M_ four or whatever. But still. I mean es especially when you get into higher level sort of uh interpretation, it's yeah. Oh well. I'm uh I'm just curious, do the like the ICSI meetings, do they have anything other than the audio? Do they have slides or anything along that I don't think they would. Okay, I'm just thinking whether or not there an are any other sources, I mean maybe CHIL data? 'Cause it's I think it's gonna be more like the TAM type situation. So Mm-hmm.. I don't know. But I mean given that it's coming from the same basis as AMI, I I expect there has to be some sort of relatively f liberal sort of distribution policy. Yeah. Yeah. The one thing about that, and this is something that came up in the last NIST evals, was the fact that it's almost too uni-modal. You've just got one lecturer talking. So once again you don't see the the complex sort of levels of interaction. I mean uh y uh there's no way of getting around that. I mean you just have to think of a a task a challenge that is relevant to that type of recording, which might mm you might use the same methods as what we've been pr talking about today. But the actual sort of what you're trying to measure might need to be slightly slightly different I guess. I don't yeah, you know, in terms of in terms of, say, measuring whether or not they're talking about the slides or off channel, I mean you might find ninety eight percent of your data is directly referring to your slides in a t in a TAM. That is uh is in TAMs. Yeah, but I'm I'm meaning if you move on to these corpora, yeah. So you might wanna be doing something else basically, yeah. Yeah. Yep. Yeah. Yep. Yeah. 
And uh and at the s and at the same end of the day, you're probably going to be using or learning about the same techniques that you'll need for doing other things along these lines. So Okay. It well it's it's to make a proposal basically for your yep uh This n Okay. Yep. The sp Mm-hmm. Yeah. Sure. Sometimes it's a lot more. Yeah. Yeah. Mm yeah. Yeah. Well, you'd uh sure. At l at at least spot if they forget to talk about something and it won't let them continue on to the next slide until they've gone through all the points. Yeah. Yeah. Yeah. Well uh that that was just through sheer weight of content though. W B yeah. As in five, six, you know number where where the graphemic yeah, well I yeah. Yeah. Although I remember discussing this like um 'cause uh when Neville was working on his his stuff, which was just based on the slides, if um he he asked me all you know, could I pre-segment the slides for him uh, you know, because you wanted to s you wanted to have the topic segmentation. And I said well just look at the the headings. Because anybody generally who writes the slides yeah. It's not okay. It's true for me then. Okay. Yep. Okay. Yeah. Okay. Okay. Uh it yeah, it depends, yeah. That's a stylistic thing as much as anything. Yep. Yeah. Hmm. Mm. Yeah. Which uh whether or not that's a good or a bad thing. It depends on the talk. Yeah, presentation, yeah. Yeah. Yeah. Something we're taught at university from a very young age. the structure. Yeah, yeah, sure. It's all subjective. Hmm. Okay. Okay. Okay. Yeah, yeah. It's quite strictly imposed now. Ye Yeah. After the after the hard drive crash we took the time to actually rebuild the the the system and actually train 'cause originally we sort of had the idea of oh we'll use a recognizer that's been trained on a large amount of telephone speech data and that should give us a better result. I just as a contrast, I trained one specifically on this data. 
Results were almost identical um which says to me there's something wrong with the data rather than and it's nothing serious. It's simply that it hasn't been um chunked it or or segmented in a way that's typical of speech resources. in terms of um you can have like uh a two second utterance, uh credit cards, and then twenty seconds of silence coming after it, which in which when you f when you first think about it, you just think oh silence, whatever, nothing will happen. But a lot of the normalization that we use Uh yeah, C_M_S_ C_ C_V_N_. Uh so w we need a decent end point detection system running before we can we can we can do that. And we're converging on having one now just in in sort of tandem with this development of this web based recognizer. Um but of course unfortunately all these things take time. Um but I I think we should see a decent sort of improvement then. Not just in terms of no longer having recognition errors whenever silence appears. 'Cause you get that immediate benefit that you don't have any insertion errors. But also it should actually affect the recognition on the speech segments as well. Th The Yeah. Yeah. Mm that's the hard p oh I might uh given the trouble we've had with this corpus I can't make any promises. That's the intention, yep. Sure, sure. We trained the system just on this data. Like there were there were six thousand utterances that weren't used for uh the call routing side of things. I think they th there were thirteen thousand in the entire corpus. Six t half of them roughly we've thrown away. So use that for acoustic training. And the results from that were almost the same as what we got for the C_T_S_ model. Y yeah. But one can imagine a commercial system uh that that that s Steven Cox used from Nuance probably had all these things, built in end point detection, et cetera, et cetera. So Mm. Yeah. Yeah. 
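Why a long stretch of silence hurts per-utterance normalization can be shown with a toy sketch. It assumes that C_M_S_ and C_V_N_ stand for cepstral mean subtraction and cepstral variance normalisation computed over the whole utterance; a single made-up feature value per frame stands in for a cepstral coefficient, and the frame counts (about two seconds of speech followed by twenty seconds of silence at a ten millisecond frame shift) are illustrative only.

```python
def mean_variance_normalise(frames):
    """Subtract the utterance mean and divide by the utterance standard deviation."""
    mean = sum(frames) / len(frames)
    variance = sum((x - mean) ** 2 for x in frames) / len(frames)
    if variance == 0:
        raise ValueError("cannot normalise a constant feature track")
    std = variance ** 0.5
    return [(x - mean) / std for x in frames]


def average(values):
    return sum(values) / len(values)


# Hypothetical feature track: 200 speech frames, then 2000 silence frames.
speech = [4.0, 6.0] * 100
silence = [0.0] * 2000

speech_only = mean_variance_normalise(speech)
with_silence = mean_variance_normalise(speech + silence)[:len(speech)]

# With clean end points the speech frames are centred on zero.
print(average(speech_only))
# With the trailing silence included, the statistics are dominated by
# silence and the same speech frames end up far away from zero.
print(average(with_silence))
```

The acoustic models are trained on neatly chopped utterances, so the shifted speech frames in the second case no longer match what the models expect, even though the speech itself has not changed.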
And there's also just things like tele uh receiver noise at the beginning and or at the end of the and I mean these these are things you can get rid of if it's a Mm. Yep Mm okay, this is f okay. Okay. We can just run some generic yeah. Well I don't mind running it. I've got all the scripts set up. So Well I'm not I'm not sure I see the point actually. Uh Mm. I don't know. Yeah. Whatever, I don't know thing th I don't know whether that a I don't know whether that's a We might as well. 'Cause I mean we've up until now uh you haven't been running any recognition on and neither did Jean-Yves on that on that data. Yeah. Yeah. Oh sure, but I mean the scripts are so automated, it's just a matter of sitting uh uh submitting a job to the cluster and I don't I don't see the point of that. I don't mind running it. That'd it'd be like uh half an hour's work in a day to run the recognition scripts. I mean I I I I know you're saying. Artem should familiarise himself with the A_S_R_ systems or whatever. Fine. But I don't uh I d I don't think this is a point that's worth sort of worrying about at this stage anyway. I mean Well 'cause I'm I'm sort of assuming that that we're gonna have so an end point detection regardless of this working on this Nuance corpus. So We need it for AMI. We need it for um the online demo. For uh all these things, yeah. It's it's a lot more difficult than you would think. Well, it's speech and any and any other class of sound. That's the problem. It's And is And any other class, yeah. If it was just silence it'd be easy. But uh um and in controlled places you can't just assume it's it's, you know, any energy below a certain threshold. And this is what everybody sort of claimed the problem was solved with years ago. But Well, I mean this is once again a Yeah, sure. But I mean this is once again a very constrained task again. I mean it's it's it's you you're almost assured it's only gonna be a single speaker on the microphone. 
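The point that end point detection is more than "any energy below a certain threshold" can be illustrated with a naive baseline. This is only a sketch of the simple approach being criticised, not the detector under development; the function names, frame length, threshold and sample values are all hypothetical.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def energy_endpoints(samples, frame_len, threshold):
    """Return (start, end) sample indices spanning every frame above threshold."""
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


FRAME_LEN = 10
THRESHOLD = 0.01

click = [1.0] + [0.0] * 9     # one frame of impulsive receiver noise
gap = [0.0] * 50              # five frames of silence
speech = [0.5, -0.5] * 15     # three frames of speech-like signal

clean_call = [0.0] * 10 + gap + speech + gap
noisy_call = click + gap + speech + gap

print(energy_endpoints(clean_call, FRAME_LEN, THRESHOLD))
print(energy_endpoints(noisy_call, FRAME_LEN, THRESHOLD))
```

On the clean signal the threshold finds the speech region, but a single click at the start of the call drags the detected start back to the beginning of the recording. That is the speech versus any other class of sound problem in miniature.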
So I mean the the only sorts of noise you're gonna have to deal with uh is is is hiss or impulsive I would say. Or or that would be a good start. No, no. Yeah. yeah. It was showing that. Um you haven't heard from him I take it either? No. Okay. Well it'd be nice to get the thesis I think. There's a sort of just a research report sort of basis for this. Yeah. But Okay. yeah. apply some pressure. Uh n no it should be looking towards the start of next year basically. The is it so I mean you should Okay, which is Yeah. And it's nice to get some feedback on I mean that feedback will probably come after your proposal possibly. But um Yeah. Yeah, yeah, yeah. Exactly. N yeah, sort out any bugs or whatever. Yeah. Yeah. I I I'd prefer to go back to the C_T_S_ models I think as well 'cause I think I think they did work a little bit better. So yeah. Yep. Hayes is another one isn't it? Oh okay, sorry. I was like 'cause I saw you sli I saw you looking at your watch before, and I like wow. Gonna have to s Have this AMI corpus linked in no time. Oh okay. Uh not normally, yeah. It could go to thirty seven, thirty eight. Yeah. Yeah, we've been told. Okay. Fine.
Yeah. Could you ju just quickly rewind? Yeah. Okay. Oh. Okay. Okay. Oh. Okay. Mm-hmm. Mm-hmm. No no. Wait wait. It's going very fast for me actually. thing actually. It's very fast. So wha first point the first thing you told me that you took the slides and you was and you had a l big dictionary and you are seeing uh what what are the inside the slide, right? Uh what the 'cause I first statement I could not understand. Uh when you said the the st uh uh the s uh the w yeah, so we what's so yeah, so could you please tell me little bit more in detail? Yeah. No no no, don't go that. I I want the no, I want to understand what did you mean by the wo uh the te uh thing like saying that I took a dictionary and then a word from the slide and what you are looking for. I could not understand clearly clearly that. Mm mm-hmm. Okay, but do how do why do you want certain words. Like wha how do you expect that certain words? Yeah. Exactly. Exactly. Yeah. Yeah. Okay. Mm-hmm. Okay. Mm-hmm. Mm. Yeah. Mm-hmm. Yeah. It appears in almost uh uh it is if it appears in all the slides it says a hundred percent correlation for me. One percent well. Yeah. Yeah. Yeah. Are like that. Okay.. Okay. Mm-hmm. Mm-hmm. Okay, it's independent of each other. Okay. Okay, fine, that's fine, that's uh uh uh s certain words it should. You have to Okay. Fine. Now I'm uh okay. I'm into the loop. Mm-hmm. Mm-hmm. Context. But it is implemented on the A_S_R_? Or uh where? Okay. Okay. Oh okay. Yeah. Else interpolate, extrapolate. Yeah. Yeah. Uh yeah. Yeah. Yeah. So you mea so uh you mean to say that um like when one case you were saying that if you include the context and all those thing it helps in your A_S_R_ improvement, but it doesn't, which more subjective I think this whole Yeah. Uh-huh. Yes. Yeah. Mm. Mm yeah, you can do that, yeah. Or it could Mm. Mm-hmm. Mm-hmm. Okay. So you so you so okay. 
So maybe like it can be uh something like in a meeting it can be like an unusual scenario for you, in a presentation with a slide. And the unusual scenario that you're not using the uh slide at all. Mm-hmm. Yeah. What are the other slide, yeah. We have plenty of things to do. What track? Yeah. Go on. Yeah, it's no problem. Okay. So uh well so ultimately now the next step what is that? Because I want like kind of a uh so at first you said that okay now you did this. This correlation studies. And they show that there's a very less correlation or in fact no correlation kind of thing. And uh um uh so uh so what next in on top of it what you are going to build I mean is the question now. Mm. Yeah. Mm-hmm. Yeah, so t I mean to the different direction now that's what I mean to say. What you like to take from there. Uh it's not defined yet, okay. Mm-hmm. But what kind of problem? We need a definition of some kind of uh problem, right? Uh yeah.. Yeah. No, no, no. No but yeah, direction. Yeah, that's that is a thing. We go on to the next direction. It should not that i it should not happen that you're doing so many things and ultimately it doesn't goes into a thesis. The that's that should not the problem. Your thesis should be w in one direction. One problem. And uh should go uh in one or uh one d yeah. It it doesn't make sense. Uh the it may be good as a C_V_ for you, but then as a thesis you will have problems uh defending it and all those things. So what is that uh what would be the next possible direction you want to take. Uh Error measure. It's yeah. Yeah, exactly. Mm. S see that is going to tell us uh something like uh wha the thing is that i uh it it see you can totally go into the text domain and be there, you know? It can be s no problem. But as long as uh it is okay. But if you want to do with uh s like certain something with the speed and all those things together then it's a different scenario. Uh you may not want to work on all the problems. 
Someone may work on some other problem. But then you you make use of that to extend your work in some. And uh and as I also t it depends like also like I think if uh boss is interested in working in both speech and text, you can understand very well. That part is also there. No, the boss. Like have a it's like instead of saying like you should like work on both side, you know, the text and the speech aspect of it. So yeah. Yeah. Yeah. Yeah. Yeah. Uh w one thing interesting is what we came up in the discussion also you about telling that uh detecting such scenarios where there is no relation between the slide and the person talking, that may be one good starting point for you actually. And then at that point y uh you detect something like that and then you show something like uh what you are saying that i if there is uh if uh there i if it is out of discussion, the words are no way are going to help your recognition anyway, A_S_R_. But if it is more on the um the slide information is not going to help you. But uh if it is uh if it is related to the sli what he is giving a presentation on the slide then you show the your A_S_R_ if you detect this even then you can show that how where you can really help or not help. Because you say there's no correlation now. But then here it comes that there at some point there might be a correlation there. Mm-hmm. Mm-hmm. Yeah. Yeah. So okay.. Mm-hmm. Mm. Oh. Mm-hmm. Okay. Mm-hmm. Mm-hmm. The th sides and the talk. Yeah. Yeah. Okay. Yeah. Like another mic. Th yeah, most of the most of the time you have the title to do, no. The AMI dat the AMI data what that like that only most of the time you knew they had the word. The word you are going to present. Everyone is presenting the same set of slide. Yeah. What about M_L_M_I_? W the one we collected last year. Audio is very bad, right? Yeah. uh one single channel or at least a closed channel collection, no. Okay, okay, okay. It was good. Tha tha tha that Uh okay. 
So they are is there are still using that uh desktop microphone or what over there. Or Uh Ah. Okay. Okay. That's fine, yeah. Yeah. It's a. Yeah it's a Yeah, I understand. Okay, so now there be uh would be some time wh what is we should be getting a time recording system, right? That uh uh uh so Mm. Okay. Lapel. Should be okay. No, it it's it's uh quite a bit, yeah, problem, to do that kind of thing. Yeah. Mm. Yeah, but probably we were the f you know, the post uh the the thing is somewhere projector. Suppose there's like the the white-board or whatever, the presentation is over there, the speaker is more probably is going to this way or that way. So probably you you may not need a very big microphone either. Probably one or two. uh uh ye uh yeah. Uh yeah. Hmm. Yeah. No the they the the the CHIL may be interesting for you. But it's a l uh well yeah, it may be interesting for your problem, I think. Y yeah. No no no. Yeah. Yeah, it's more like TAM. But uh yeah. The the their collection is much uh uh like they have uh R_T_N_s and all those thing. People sitting and uh it's a lec it's a lecture uh a meeting collection. And I uh is it through n n uh L_D_A_ or something. Mm-hmm. Yeah. So so probably we can yeah. CHIL may be interesting actually. You can look at it. In the text. Mm. Well but uh S Yep. Hmm. Mm-hmm. That's that's a yeah. That that they have. Mm. They're talking fast. I had a problem I like find the answer for my own. Yeah. So what about the figure things? What happens if there's a figures coming on the slide? So what do you want to do with that? On the s uh-huh. Either stop it Yeah. Okay. Okay. Okay. Okay. Okay. Okay. Okay. Yeah. Usually in products and resources. Okay, so what are you going to do between the data when you collect and this? What do you want to do now? By the time you get the data it's going to take another two months if I am right, the whole capturing system when it is going to come. So what is the next step? 
Yeah, you have to submit it in like two months also again. Yeah, yeah, to submit, yeah. It's one year, right?. For sure. You you need it bad. Call routing where. That's inte. I think we should yeah. This is yeah. Uh h oh. Yeah. Mm yeah. Mm yeah. Uh uh Uh yeah. Yeah. Even if little bit of. Oh. This Uh ye yeah. Yeah. No no, I think you yeah, it's uh it's huge. Yeah. It it it's non-determinist. Yeah, that's what we expect to Yeah. Exa that's the intention there. Sure. Mm ye. No, no, no, no, no. It i Yeah. Surely they should they should have had that actually. So maybe like we had to do this end point detection and probably a little bit of final filtering. And then I think you should be pretty much converging to the system. But the still the reason with the the they had like what are the systems we train on databases are they are chop neatly. That there is not too much silence beginning, not too much silence at the end. They are they are j they they chopped so neatly for the it like custom made database. But uh here it's not like that you know like uh-huh. Yeah. There are the several things. Like one thing is like we can do this begin and end detection, as we said. Or you and he can uh we can try to use uh France Telecom's uh approach actually. And that um uh uh they they have this front end which uh uh which which can do a spectral. And plus it can give you a voice activity detection. Part f part from basis it can give. So somethi we can just yeah, we can just run it quickly and see. And probably you will have to run it. Will give you the idea that is it okay like? Yeah. No no no no. Uh not running, I mean to say p probably uh he should run it or uh like we can ge help him to run it actually. Because at some point he has to get into Uh I mean to say I I mean to say that it's okay. We have all the scripts. But probably uh he should run it I mean to say. 
It's a The all these experiments probably Uh the they are uh the scri because up to uh probably maybe we shall do mm it yeah. Uh for to make it faster we should ru it run maybe. If we run f we ourself it it's very fast. Uh yeah. Yeah. Okay. Mm okay. No, but I mean to say that if you if you if you if you if you give him the scripts and ask him to run, uh the Uh no. Yeah. But it's Uh Yeah. Yeah, oh yeah. Yeah. Yeah, that uh-huh. No no no no, I the end point system is different thing, no? That's and we need it for AMI plus other thing. And so but the I am to all saying yeah. I I am. It's n it uh no no no, it's not speech non-speech. Uh uh it's uh it's uh it's yeah. Yeah. Any anything nonsense or else it beeped. It's not silent. It's uh you can say that's very yeah. And it can be it can be anything. Yeah. Breathing. Yeah. No anyway, so we okay, so we'll finish this call routing stuff. What do you say all thi this is okay with you, right? So we'll k because I think uh it it it has not been even submitted outside also the work what Jean-Yves did yet. Mm yeah not done that yet. Tried Uh yeah, he still yeah. Uh-uh. Yeah. So yeah, it it's pretty much good time also, no? And he c he can and uh when do you intend to start writing the proposal? Probably you should start, I think. S slow. Little bit start thinking. No no, you you should not wait till the end. That's what I mean to say. Uh yes,. Because you Because it Okay. Mm. Mm-hmm. Uh. Yeah, it can go there, yeah. See No, no, no. I think it's uh it's good if he submits his pr proposal. As a research report you put here, submit it for the conference. It's quite good for the proposal directly. Yeah. Yeah. No, I think uh we we may have to like it can have a uh it's it's kind of like uh uh maybe take like uh uh two or three days, you know, like to sit out and say uh one converge to one approach and then just do without that yeah. And probably yeah. So that is the thing. 
We have to sort out the bug kind of thing and see how reliable is this end point detection for us. Yeah, mm-hmm. Uh yeah, I think uh then we can go back to C_T_S_ actually. Yeah. Okay. And uh yeah, so that is this. So you wanted to talk something more than that. Well we obstructed you with that. You wanted to talk something different also. No. Or maybe another meeting. And then I say I don't remember. No, I was just looking this. Okay uh we are not out of time, you know. Yeah. Yeah, yeah, it's happened uh it happened with my friend. Like uh he was taking pictures pictures pictures and it was going thirty seven, thirty eight, forty even. So I said there's some problem with yours uh your uh uh the way he uh put the roll inside. There is some problem, it has got stuck somewhere. It can never go to forty one, forty Yeah, I uh uh thirty seven, thirty eight is fine. Forty, forty one is but see, he has already. Don't listen. Okay.
